Identifying Compiler and Optimization Level in Binary Code From Multiple Architectures

Authors

Abstract

While compiling a native application, different compiler flags or optimization levels can be configured. This choice depends on the requirements. For example, if the application binary is intended for final release, the compiler and optimization settings should be set for execution speed and efficiency. Alternatively, if the application is to be used for debugging purposes, the debug flags should be configured accordingly, usually involving minor or no code optimization. However, this information cannot be easily extracted from a compiled binary. Nonetheless, ensuring the same compiler and compilation flags is particularly important when comparing different binary files, to avoid inaccurate or unreliable analyses. Unfortunately, to understand which flags and optimizations have been used, a deep knowledge of the target architecture and the compiler used is required. In this study, we present two deep learning models used to detect both the compiler and the optimization level in a compiled binary. The optimization levels we study are O0, O1, O2, O3, and Os in the x86_64, AArch64, RISC-V, SPARC, PowerPC, MIPS, and ARM architectures. In addition, for the x86_64 and AArch64 architectures, we also determine whether the compiler is GCC or Clang. We created a dataset of more than 76000 binaries and used it for training. Our experiments showed over 99.99% accuracy in detecting the compiler and between 92% and 98%, depending on the architecture, in detecting the optimization level. Furthermore, we analyzed the change in accuracy when the amount of data was extremely limited. Our study shows that it is possible to accurately detect both the compiler flag settings and the optimization level with function-level granularity.
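
As a rough illustration of the classification targets named in the abstract, the sketch below enumerates one label per combination of architecture, compiler, and optimization level. The function name `label_space` and the tuple layout are hypothetical and not taken from the paper. The compiler is left as `None` for architectures where the abstract does not mention compiler detection, and the `-O…` spellings are the usual GCC/Clang command-line flags for these levels.

```python
from itertools import product

# Optimization levels and architectures as listed in the abstract.
OPT_LEVELS = ["O0", "O1", "O2", "O3", "Os"]
ARCHITECTURES = ["x86_64", "AArch64", "RISC-V", "SPARC", "PowerPC", "MIPS", "ARM"]

# Architectures for which the abstract says the compiler is also detected.
COMPILER_ARCHS = {"x86_64", "AArch64"}
COMPILERS = ["GCC", "Clang"]


def label_space():
    """Enumerate hypothetical (architecture, compiler, flag) labels.

    The compiler entry is None where the abstract does not mention
    compiler detection for that architecture.
    """
    labels = []
    for arch in ARCHITECTURES:
        compilers = COMPILERS if arch in COMPILER_ARCHS else [None]
        for compiler, level in product(compilers, OPT_LEVELS):
            labels.append((arch, compiler, "-" + level))
    return labels


if __name__ == "__main__":
    for label in label_space():
        print(label)
```

Under these assumptions, the two architectures with compiler detection contribute 2 × 5 labels each and the remaining five contribute 5 each.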

Similar resources

Optimization and Code Generation in a Compiler for Several Machines

This paper describes optimization techniques that have been implemented in a compiler which was designed to produce code comparable to that produced by hand. Additional optimization methods were incorporated into successive versions of the compiler. It was found that no single method was effective with all compiled programs but that each of the techniques described was effective for some progr...

Compiler Technology for Migrating Sequential Code to Multi-threaded Architectures

Executing sequential code in parallel on a multithreaded machine has been an elusive goal for many years. It has recently become quite important due to the widespread introduction of multi-cores in PCs. Automatic multi-threading could not be achieved so far because classic compiler analysis was not powerful enough and program behavior was found to be in many cases input dependent. Run time, spe...

High-Level Code Optimization

Software systems are inherently complex. Building large software systems has proved so difficult precisely because of the complexity levels with which programmers have to deal. In [7] Brooks divides complexity into essential and accidental and argues that solutions which worked in other fields cannot apply to software. Essential complexity stems from the very nature of software (i.e. the large number...

Identifying Multiple Authors in a Binary Program

Knowing the authors of a binary program has significant application to forensics of malicious software (malware), software supply chain risk management, and software plagiarism detection. Existing techniques assume that a binary is written by a single author, which does not hold true in the real world because most modern software, including malware, often contains code from multiple authors. In thi...

An Instrumenting Compiler for Enforcing Confidentiality in Low-Level Code

We present an instrumenting compiler for enforcing data confidentiality in low-level applications (e.g. those written in C) in the presence of an active adversary. In our approach, the programmer marks secret data by writing lightweight annotations on top-level definitions in the source code. The compiler then uses a static flow analysis coupled with efficient runtime instrumentation, a custom ...

Journal

Journal title: IEEE Access

Year: 2021

ISSN: 2169-3536

DOI: https://doi.org/10.1109/access.2021.3132950